International Conference Recent Advances in Natural Language Processing

Authors

  • Ruslan Mitkov
  • Leon Derczynski
Abstract

Part-of-Speech (POS) tagging is a key step in many NLP algorithms. However, tweets are difficult to POS tag because there are many phenomena that frequently appear in Twitter that are not as common, or are entirely absent, in other domains: tweets are short, are not always written maintaining formal grammar and proper spelling, and abbreviations are often used to overcome their restricted lengths. Arabic tweets also show a further range of linguistic phenomena such as usage of different dialects, romanised Arabic and borrowing foreign words. In this paper, we present an evaluation and a detailed error analysis of state-of-the-art POS taggers for Arabic when applied to Arabic tweets. The accuracy of standard Arabic taggers is typically excellent (96-97%) on Modern Standard Arabic (MSA) text; however, their accuracy declines to 49-65% on Arabic tweets. Further, we present our initial approach to improve the taggers’ performance. By doing some improvements based on observed errors, we are able to reach 79% tagging accuracy.

Similar resources

Word Class Functions for Syntactic-Semantic Analysis

Appeared in Proceedings of the 2nd International Conference on Recent Advances in Natural Language Processing (RANLP’97), pp. 312–317, 1997. In this paper, Analysis with Word Class Functions (WCFA) is presented as a paradigm for syntactic-semantic analysis of natural language. The main characteristics of this approach are: word-orientation, the central role of word class functions, two phases o...

Natural Language Processing and Information Systems, 10th International Conference on Applications of Natural Language to Information Systems, NLDB 2005, Alicante, Spain, June 15-17, 2005, Proceedings

Natural Language Processing and Information Systems, 15th International Conference on Applications of Natural Language to Information Systems, NLDB 2010, Cardiff, UK, June 23-25, 2010. Proceedings

Introduction: the International Conference on Intelligent Biology and Medicine (ICIBM) 2016: special focus on medical informatics and big data

In this editorial, we first summarize the 2016 International Conference on Intelligent Biology and Medicine (ICIBM 2016) held on December 8-10, 2016 in Houston, Texas, USA, and then briefly introduce the ten research articles included in this supplement issue. At ICIBM 2016, a special theme, "Medical Informatics and Big Data," was dedicated to the recent advances of data science in the medical ...

Publication date: 2011